-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: use bpf_sk_assign at tproxy_wan_ingress #383
Conversation
Thanks for your high-quality PR. |
0b16e50
to
7604753
Compare
由于 main 还没合并 #386 , 我在自己的 fork 上运行了全套测试,居然全部通过了! https://github.com/jschwinger233/dae/actions/runs/7369675519/job/20055254128?pr=5 接下来我整理一下就可以正式 review 了,最迟明天,现在要去博德之门启动了! |
This comment was marked as off-topic.
This comment was marked as off-topic.
f016441
to
ac3642f
Compare
Label ready for review. Recommend rebasing main after merging #386 to double check. |
IPv6 UDP 现在状态是被内核在 __udp6_lib_rcv diff --git a/control/kern/tproxy.c b/control/kern/tproxy.c
index 688fd3c..9dc71e4 100644
--- a/control/kern/tproxy.c
+++ b/control/kern/tproxy.c
@@ -1810,8 +1810,6 @@ int tproxy_wan_egress(struct __sk_buff *skb) {
// }
// }
- disable_l4_checksum(skb, link_h_len, l4proto, ihl);
-
// Redirect from egress to ingress.
if ((ret = bpf_redirect(skb->ifindex, BPF_F_INGRESS)) == TC_ACT_SHOT) {
bpf_printk("Shot bpf_redirect: %d", ret); 就能修好。 但是如 PR description 里所述,由于 ci 测试是 assert failure,所以等这个合并后单独提交一个 pr,避免破坏 ci。 建议顺序: #386 -> this pr -> fix IPv6 UDP |
As we are going to implement tproxy hijack via bpf_sk_assign, tproxy response won't reach wan iface at all, unless wan iface == lan iface. The only remaining "tproxy_response" is the place returning TC_ACT_PIPE to hand packets over from tproxy_wan_egress to tproxy_lan_egress. This commit also deletes rev-NAT logic for tproxy response. This commit tries to make a minimum change, otherwise file diff is too confusing to reviewers. I'll clean it up in the next patch.
This commit merely removes the `if (false)` branch at: ``` if (false) { // Comments } else { ... } ``` The file diff becomes completely messed up, so I split it into a separate patch without any functional change.
Note the necessity of separation of `assign_socket_tcp` and `assign_socket_udp`: As `struct bpf_sock *` has different verifier types for tcp and udp, the code below can't pass verifier: ``` static __always_inline int assign_socket(struct __sk_buff *skb, struct bpf_sock_tuple *tuple, __u32 len, __u8 nexthdr) { struct bpf_sock *sk; switch (nexthdr) { case IPPROTO_TCP: sk = bpf_sk_lookup_tcp(skb, tuple, len, BPF_F_CURRENT_NETNS, 0); case IPPROTO_UDP: sk = bpf_sk_lookup_udp(skb, tuple, len, BPF_F_CURRENT_NETNS, 0); } if (!sk) { return -1; } int res = bpf_sk_assign(skb, sk, 0); bpf_sk_release(sk); return res; } ```
We no longer need tcp_dst_map for NAT. Relevant Golang logic is also removed. One thing need to mention is "dst_routing_result" struct. Although tcp_dst_map is gone, dst_routing_result struct is still in use under userspace at https://github.com/daeuniverse/dae/blob/cab1e4290967340923d7d5ca52b80f781711c18e/control/udp.go#L69C17-L69C17. Therefore, this commit remains this struct and make some efforts to ensure bpf objects are compiled with it.
Previously, wan_egress has to encap UDP packets with routing info, but it's no more necessary as we are in favor of bpf_sk_assign without NAT.
This is a must-have, otherwise packets being bpf_sk_assigned and routed to local on wan will be dropped by kernel during fib_lookup: ``` // https://github.com/torvalds/linux/blob/v6.5/net/ipv4/fib_frontend.c#L381 static int __fib_validate_source() ... if (res.type != RTN_UNICAST && (res.type != RTN_LOCAL || !IN_DEV_ACCEPT_LOCAL(idev))) goto e_inval; ... ```
ac3642f
to
9e34149
Compare
9e34149
to
8f96bbf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff. Thanks for your contribution! Appreciate the efforts.
Please feel free to merge. @jschwinger233 |
With daeuniverse#383, dae no longer encap UDP, leaving this decap UDP logic obsolete. Removal of lan_egress also ends up with several helper functions unused, this commit deletes them as well. As for scenario of LAN==WAN, after this commit only wan_egress will be attached on LAN. Wan_egress therefore will pass all traffic sent from dae because pid_is_control_plane() gives true, so it looks like no concerns.
With daeuniverse#383, dae no longer encap UDP, leaving this decap UDP logic obsolete. Removal of lan_egress also ends up with several helper functions unused, this commit deletes them as well. As for scenario of LAN==WAN, after this commit only wan_egress will be attached on LAN. Wan_egress therefore will pass all traffic sent from dae because pid_is_control_plane() gives true, so it looks like no concerns.
Closes: #375
废弃 tcp_dst_map,废弃 NAT 模式,全面使用 routing_tuple_map,删除兼容代码。
废弃 udp_encap, 部分解决 IPv6 UDP 的问题.
当前状态: